Searching in document images: what does the appearance of a document tell us about what it means?

Authors

  • Andrew Bagdanov
  • Marcel Worring
  • Arnold Smeulders
Abstract

The document understanding problem can be informally defined as the automatic extraction of meaning from documents. In the Intelligent Sensory Information Systems group we have experimented with analyzing the visual appearance of documents in order to extract meaning. That is, we concentrate on how documents look rather than on what they say. We motivate this approach with several applications from document image understanding. First, we describe how document genre classification can be used to group visually similar documents together, which simplifies the analysis task for an entire class of documents. Second, we consider the logical block labeling problem and show how logical labels (e.g. title, author, header, footer) can be assigned to blocks of text using a few visual features. Third, we discuss our approach to detecting the reading order of text using the visual structure of a document. These examples build on work in the field of content-based image retrieval (CBIR). Content-based image retrieval aims at searching and browsing image repositories on the basis of a visual specification of the query. The query may consist of one or, preferably, several examples, and the results may be presented as a linear list of items or, preferably, as a similarity grouping. Our research on colour representations of real-world objects and on the colour composition of documents has shown that CBIR techniques can be successfully applied to simplify the document understanding problem.
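The query-by-example idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it assumes a coarse RGB colour histogram as the visual feature and histogram intersection as the similarity measure, a common choice in CBIR. All names (`colour_histogram`, `intersection`, `rank_by_examples`) and the toy pixel data are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Pixel = Tuple[int, int, int]
Histogram = List[float]


def colour_histogram(pixels: Sequence[Pixel], bins: int = 4) -> Histogram:
    """Normalised RGB histogram with `bins` levels per channel (assumed feature)."""
    hist = [0.0] * (bins ** 3)
    for r, g, b in pixels:
        # Map each 0..255 channel value to a bin index 0..bins-1.
        ri, gi, bi = r * bins // 256, g * bins // 256, b * bins // 256
        hist[(ri * bins + gi) * bins + bi] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]


def intersection(a: Histogram, b: Histogram) -> float:
    """Histogram intersection: 1.0 for identical normalised histograms."""
    return sum(min(x, y) for x, y in zip(a, b))


def rank_by_examples(
    examples: Sequence[Histogram], repository: Dict[str, Histogram]
) -> List[Tuple[str, float]]:
    """Rank repository items by mean similarity to one or more query examples."""
    scores = {
        name: sum(intersection(q, h) for q in examples) / len(examples)
        for name, h in repository.items()
    }
    # Present the result as a linear list, most similar first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy "document images": a mostly white page and a mostly red flyer.
repository = {
    "plain_page": colour_histogram([(250, 250, 250)] * 90 + [(10, 10, 10)] * 10),
    "red_flyer": colour_histogram([(200, 20, 20)] * 80 + [(250, 250, 250)] * 20),
}
query = [colour_histogram([(255, 255, 255)] * 85 + [(0, 0, 0)] * 15)]
ranking = rank_by_examples(query, repository)
```

Averaging over several example histograms reflects the abstract's preference for more than one query example. Grouping the results by similarity instead of listing them would need a clustering step, which this sketch leaves out.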

Publication date: 2001